Abstract

If one places $N$ cities randomly on a continuum in a unit area, extensive numerical results
and their analysis (scaling, etc.) suggest that the best optimized travel distance per city
becomes $l_E \simeq 0.72/\sqrt{N}$ for the Euclidean metric, and $l_M \simeq 0.92/\sqrt{N}$ for the
Manhattan metric. The analytic bounds we discuss here give $0.5 < l_E\sqrt{N} < 0.92$ and
$0.64 < l_M\sqrt{N} < 1.17$. When the cities are randomly placed on a lattice with concentration $p$,
we find (with $N = p$ for unit area of the country) that $l_E\sqrt{p}$ and $l_M\sqrt{p}$ vary
monotonically with $p$: $l_E\sqrt{p} = l_M\sqrt{p} = 1$ for $p = 1$, and $l_E\sqrt{p} \simeq 0.72$ and
$l_M\sqrt{p} \simeq 0.92$ as $p \to 0$. The problem is trivial for $p = 1$ but it reduces to the
continuum TSP for $p \to 0$. We did not get any irregular behaviour at any intermediate point,
e.g., the percolation point. The crossover from triviality to the NP-hard problem seems to
occur at $p < 1$.



* K. C. Kar Memorial Lecture, 1999 (to be published in Indian J. of Theo. Phys., Calcutta). 



1. Introduction 

In everyday life we face several complex problems, classified as combinatorial optimization 
problems, the solutions of which are of great practical importance. Research in this area 
tries to find different efficient techniques for finding the extremum (maximum or minimum) 
values of a function of many different independent variables [1-3]. 

The travelling salesman problem (TSP) is a simple example of a combinatorial optimization
problem and perhaps the most famous one. Given a certain set of cities and the intercity
distance metric, a travelling salesman must find the shortest tour in which he visits all the
cities and comes back to his starting point. It is a non-deterministic polynomial complete
(NP-complete) problem. NP problems are those for which a potential solution can be
checked efficiently for correctness, but finding such a solution appears to take time which
scales exponentially with the size $N$ in the worst case. The completeness property of
NP-complete problems means that if it is possible to find a deterministic algorithm that solves
one NP-complete problem in polynomial time, then the other NP-complete problems could
also be solved in polynomial time.

In the TSP, the most naive algorithm for finding the optimal tour would have to consider
all the $(N-1)!/2$ possible tours for $N$ cities and check for the shortest of them.
Working this way, the fastest computer available today would require more time than the
current age of the universe to solve a case with about 30 cities. The typical-case behaviour
is difficult to characterize for the TSP, though it is believed to require exponential time
to solve in the worst case. For this reason the TSP serves as a prototype problem for the
study of combinatorial optimization problems in general.
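
The exhaustive search described above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the function names (tour_length, brute_force_tsp) and the sample cities are hypothetical and not part of the original work. The start city is fixed and mirror-image tours are skipped, so exactly $(N-1)!/2$ tours are examined for $N \ge 3$.

```python
import itertools
import math


def tour_length(points, order):
    """Length of the closed Euclidean tour visiting points in the given order."""
    n = len(order)
    return sum(math.dist(points[order[i]], points[order[(i + 1) % n]]) for i in range(n))


def brute_force_tsp(points):
    """Exhaustively search all distinct closed tours; feasible only for tiny N."""
    n = len(points)
    best_order, best_len, examined = None, float("inf"), 0
    # Fix city 0 as the start, which removes the N equivalent rotations of a tour.
    for perm in itertools.permutations(range(1, n)):
        # Skip the mirror image of a tour already seen (removes the factor 2).
        if perm[0] > perm[-1]:
            continue
        examined += 1
        order = (0,) + perm
        length = tour_length(points, order)
        if length < best_len:
            best_order, best_len = order, length
    return best_order, best_len, examined


if __name__ == "__main__":
    # Hypothetical example: six cities in the plane.
    cities = [(0, 0), (1, 0), (1, 1), (0, 1), (0.5, 1.5), (0.5, 0.5)]
    order, length, examined = brute_force_tsp(cities)
    print(order, length, examined)
    # Number of distinct tours that would have to be examined for N = 30.
    print(math.factorial(30 - 1) // 2)
```

The count printed on the last line grows factorially with $N$, which is why this approach is useless beyond a handful of cities.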

In the normal TSP, we have $N$ cities distributed in some continuum space and
we determine the average optimal travel distance per city $l_E$ in the Euclidean metric (with
$\Delta r_E = \sqrt{\Delta x^2 + \Delta y^2}$), or $l_M$ in the Manhattan metric (with
$\Delta r_M = |\Delta x| + |\Delta y|$). Since the average distance per city (for fixed area)
scales with the number of cities $N$ as $1/\sqrt{N}$, we find that the normalized travel
distances per city $\Omega_E = l_E\sqrt{N}$ or $\Omega_M = l_M\sqrt{N}$ become
the optimized constants and their values depend on the method used to optimize the travel
distance. In section 2, we discuss some algorithms used to determine the optimal tour and
find the values of the constants $\Omega_E$ and $\Omega_M$ for the optimized travel. In section 3,
we present an analytic method to estimate the upper and lower bounds of $\Omega_E$ and $\Omega_M$.

In the lattice version of the TSP, the cities are represented by randomly occupied lattice
sites of a two-dimensional square lattice, the fractional number of occupied sites being $p$
(lattice occupation concentration). In this case the average optimal travel distance in the
Euclidean metric, $\bar{l}_E$, and in the Manhattan metric, $\bar{l}_M$, vary with the lattice
concentration $p$. The normalised travel distances per city are then defined as
$\Omega_E = \bar{l}_E\sqrt{p}$ and $\Omega_M = \bar{l}_M\sqrt{p}$.
In section 4, we study the variation of $\Omega_E$ and $\Omega_M$, and of the ratio
$\Omega_M/\Omega_E$, with $p$. Finally, we draw conclusions in section 5.

2. Some heuristic algorithms 

The most naive method to obtain an approximate solution of the travelling salesman problem is
the "greedy" heuristic algorithm [1, 2]. Suppose we have a random arrangement of $N$ cities
in a square (country) of fixed area (taken to be unity). Let us think of any tour to start with
and then make a local exchange of a pair of cities in the tour. We compute the new tour
length and if it is lower than the previous one, then the greedy algorithm accepts the new
tour as the starting point for further such modifications. The "Lin-Kernighan" algorithm
[4, 5] considers local exchange between three or more cities.
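
A minimal sketch of the pair-exchange greedy search described above is given below. It is an illustration only: the function names, the sweep order over pairs and the stopping rule (stop when a full sweep yields no improvement) are our assumptions, not details taken from refs. [1, 2].

```python
import math
import random


def tour_length(points, order):
    """Length of the closed Euclidean tour visiting points in the given order."""
    n = len(order)
    return sum(math.dist(points[order[i]], points[order[(i + 1) % n]]) for i in range(n))


def greedy_pair_exchange(points, order=None, seed=0):
    """Swap two cities in the tour; keep the swap only if the tour gets shorter."""
    rng = random.Random(seed)
    n = len(points)
    if order is None:
        order = list(range(n))
        rng.shuffle(order)  # any tour will do as a starting point
    else:
        order = list(order)
    best = tour_length(points, order)
    improved = True
    while improved:  # stops at a local minimum of the pair-exchange move
        improved = False
        for i in range(n - 1):
            for j in range(i + 1, n):
                order[i], order[j] = order[j], order[i]
                trial = tour_length(points, order)
                if trial < best - 1e-12:
                    best = trial  # accept: shorter tour becomes the new start
                    improved = True
                else:
                    order[i], order[j] = order[j], order[i]  # reject: undo swap
    return order, best


if __name__ == "__main__":
    rng = random.Random(1)
    cities = [(rng.random(), rng.random()) for _ in range(30)]
    order, length = greedy_pair_exchange(cities)
    print(length / math.sqrt(len(cities)))  # normalized length L / sqrt(N)
```

The tour returned is only locally optimal with respect to pair exchanges, which is the drawback discussed next.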

The essential drawback of such local search algorithms is the obvious one of getting
stuck at a local minimum, where any local rearrangement in the tour does not improve the
optimized tour length. The "simulated annealing method" [2, 6] is an ingenious method in
analogy with the thermodynamic way of avoiding such local minima in free energy (glass
formations) and achieving the global minimum of a many-body system by slow cooling or
annealing. Rapid quenching leads to the trapping of the system in a local
minimum (or glass) state. The system cannot get out of it, since the Boltzmann probability
to get out of the minimum drops to zero as the temperature becomes zero due to quenching.
This is similar to the greedy or other local search algorithms. In annealing, the system
is cooled slowly so that when it falls into a local trap, the finite Boltzmann probability
($\sim \exp[-(E' - E)/kT]$, for trap energy $E$ and barrier height $E'$) allows the system to get out
of the trap, maintaining a general flow to lower energy states as the temperature decreases.
Eventually the system anneals to the ground state at the lowest temperature.

In the TSP case, one takes the total tour length $L$ ($= Nl$) as the energy $E$ and one
introduces a fictitious temperature $T$. Initially one takes $T$ very high, such that the average
total tour length $L$ is much higher than the global minimum. The tours are then modified
locally and the modified tours are accepted with probability $\sim \exp(-\Delta L/kT)$, where
$\Delta L$ is the change in the tour length. In the greedy algorithm the probability is unity for
negative $\Delta L$ and zero for positive $\Delta L$. Here, the probability is non-vanishing even for
positive $\Delta L$, as long as the temperature is nonzero.
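
The acceptance rule above can be sketched as follows. This is a hedged illustration rather than the procedure of refs. [2, 6]: the local move (swapping two cities), the geometric cooling schedule and all parameter values (t_start, t_end, cooling, sweeps) are assumptions chosen for the example, and the constant $k$ is absorbed into the temperature.

```python
import math
import random


def tour_length(points, order):
    """Length of the closed Euclidean tour visiting points in the given order."""
    n = len(order)
    return sum(math.dist(points[order[i]], points[order[(i + 1) % n]]) for i in range(n))


def simulated_annealing(points, t_start=1.0, t_end=1e-3, cooling=0.9, sweeps=10, seed=0):
    """Anneal a random tour; returns the best tour seen and its length."""
    rng = random.Random(seed)
    n = len(points)
    order = list(range(n))
    rng.shuffle(order)
    length = tour_length(points, order)
    best_order, best_len = order[:], length
    t = t_start
    while t > t_end:
        for _ in range(sweeps * n):
            i, j = rng.sample(range(n), 2)
            order[i], order[j] = order[j], order[i]  # local modification
            trial = tour_length(points, order)
            delta = trial - length  # change in tour length, Delta L
            # Always accept a shorter tour; accept a longer one with
            # probability exp(-Delta L / T).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                length = trial
                if length < best_len:
                    best_order, best_len = order[:], length
            else:
                order[i], order[j] = order[j], order[i]  # reject: undo swap
        t *= cooling  # slow (geometric) cooling
    return best_order, best_len


if __name__ == "__main__":
    rng = random.Random(2)
    cities = [(rng.random(), rng.random()) for _ in range(20)]
    order, length = simulated_annealing(cities)
    print(length / math.sqrt(len(cities)))
```

Setting the temperature to zero from the start would reduce the acceptance test to the greedy rule of the previous paragraph.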

Simulated annealing and numerous heuristic generalizations of the local search algorithm 
optimize very effectively on small scales involving a small number of variables, but fail for the 
larger scales that require the modification of many variables simultaneously. To deal with 
the large scales, "genetic algorithms" [7] use a "crossing" procedure which takes two good
configurations ("parents") from a population and finds sub-paths that are common to the
parents. It generates a "child" by reconnecting those sub-paths, either randomly or by using
large parts of its parents. A population of configurations is evolved from one generation to 
the next using these crossings followed by a selection of the best children. However, this 
approach supposedly does not work well in practice since it is extremely difficult to produce 
two parents and cross them to make a child as good as them. This is a major drawback of 
the genetic algorithms and is responsible for their limited use. 

So far, careful analysis of the numerical results obtained indicates that $\Omega_E \simeq 0.72$ [8]
for the TSP on a continuum.

3. Some analytical results for the bounds for $\Omega$

Although the TSP is a multivariable optimization problem (real number of variables
$\sim N!$ in an $N$ city problem), we now look for an approximate analytical solution (upper
bound) by expressing the travel distance as a function of a single variable and optimizing
the distance with respect to that variable [9]. As is obvious, the problem is trivial in the
one-dimensional case, where any directed tour will solve it. In two dimensions, one can again
reduce it (approximately) to a one-dimensional problem, where the square (country) is
divided into strips of width $W$ and within each strip the salesman visits the cities in a
directed way. The total travel distance is then optimized with respect to $W$.
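
The strip construction can be tried directly on random cities. The sketch below is our own illustration (the function names and the alternating sweep direction between neighbouring strips are assumptions): it cuts the unit square into horizontal strips of width close to $W$ and visits the cities of each strip in order of their $x$ coordinate.

```python
import math
import random


def tour_length(points, order):
    """Length of the closed Euclidean tour visiting points in the given order."""
    n = len(order)
    return sum(math.dist(points[order[i]], points[order[(i + 1) % n]]) for i in range(n))


def strip_tour(points, width):
    """Directed tour through horizontal strips of (approximately) the given width."""
    n_strips = max(1, round(1.0 / width))
    strips = [[] for _ in range(n_strips)]
    for idx, (x, y) in enumerate(points):
        k = min(int(y * n_strips), n_strips - 1)  # strip index from the y coordinate
        strips[k].append(idx)
    order = []
    for k, strip in enumerate(strips):
        # Sweep left to right in even strips and right to left in odd strips.
        strip.sort(key=lambda i: points[i][0], reverse=(k % 2 == 1))
        order.extend(strip)
    return order


if __name__ == "__main__":
    rng = random.Random(0)
    n = 2000
    cities = [(rng.random(), rng.random()) for _ in range(n)]
    # Strip width W = 1.73 / sqrt(N), i.e. normalized width of about 1.73.
    order = strip_tour(cities, 1.73 / math.sqrt(n))
    print(tour_length(cities, order) / math.sqrt(n))  # estimate of l_E * sqrt(N)
```

For large $N$ the printed value should lie close to the upper bound derived in this section, up to boundary effects at the strip ends and the closing link of the tour.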

Let the strip width be $W$ and the probability density of cities be $\rho$ ($= N$ for unit area).
We have a city at $(0, y_1)$ [See Fig. 1]. The probability that the next city is between distances
$x$ and $x + \Delta x$ is $\rho W \Delta x$. The probability that there is no city in the distance
$x = n\Delta x$ is $(1 - \rho W \Delta x)^n \simeq e^{-\rho W x}$. The probability that there is a city
between $y$ and $y + \Delta y$ is $\Delta y/W$. Hence the probability that there is no other city
within distance $y$ is $(1 - y/W)$. The average distance between any two consecutive cities is
therefore

$$ l_E = 2 \int_{x=0}^{\infty} \int_{y=0}^{W} \sqrt{x^2 + y^2}\; \rho W\,dx\; e^{-\rho W x}\; \frac{dy}{W}\left(1 - \frac{y}{W}\right) . \tag{1} $$

The factor 2 arises to take care of the fact that $y$ can be both positive and negative. We
make the substitutions $u = \rho W x$ and $v = y/W$, so that

$$ l_E = 2 \int_{u=0}^{\infty} \int_{v=0}^{1} \frac{1}{\rho W} \sqrt{u^2 + \rho^2 W^4 v^2}\; e^{-u} (1 - v)\, du\, dv . $$

We introduce two dimensionless quantities $\Omega_E = \sqrt{\rho}\, l_E$ and $\tilde{W} = \sqrt{\rho}\, W$, so that

$$ \Omega_E = \frac{2}{\tilde{W}} \int_{u=0}^{\infty} \int_{v=0}^{1} \sqrt{u^2 + \tilde{W}^4 v^2}\; e^{-u} (1 - v)\, du\, dv . \tag{2} $$

Using the method of Monte Carlo integration to evaluate the above integral, we get the
minimum $\Omega_E \simeq 0.92$ at normalized strip width $\tilde{W} \simeq 1.73$ [See Fig. 2].

In the Manhattan metric the average distance between any two consecutive cities is

$$ l_M = 2 \int_{x=0}^{\infty} \int_{y=0}^{W} (x + y)\; \rho W\,dx\; e^{-\rho W x}\; \frac{dy}{W}\left(1 - \frac{y}{W}\right) . \tag{3} $$

As before we introduce $u = \rho W x$ and $v = y/W$, so that

$$ l_M = 2 \int_{u=0}^{\infty} \int_{v=0}^{1} \frac{1}{\rho W} \left(u + \rho W^2 v\right) e^{-u} (1 - v)\, du\, dv , $$

and then introduce the dimensionless quantities $\Omega_M = \sqrt{\rho}\, l_M$ and $\tilde{W} = \sqrt{\rho}\, W$, so that

$$ \Omega_M = \frac{2}{\tilde{W}} \int_{u=0}^{\infty} \int_{v=0}^{1} \left(u + \tilde{W}^2 v\right) e^{-u} (1 - v)\, du\, dv . \tag{4} $$

Using the method of Monte Carlo integration, we get the minimum $\Omega_M \simeq 1.15$ at the
normalized strip width $\tilde{W} \simeq 1.73$ [See Fig. 3].
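
The minima of eqs. (2) and (4) can be checked numerically. The sketch below is an illustration under our own assumptions: instead of the Monte Carlo integration used in the text it uses Gauss-Laguerre quadrature in $u$ and Gauss-Legendre quadrature in $v$ (both from NumPy), and it scans a grid of strip widths chosen for the example. For eq. (4) the integral can also be done by hand, giving $\Omega_M = 1/\tilde{W} + \tilde{W}/3$, with minimum $2/\sqrt{3} \approx 1.155$ at $\tilde{W} = \sqrt{3} \approx 1.73$.

```python
import numpy as np

# Gauss-Laguerre nodes absorb the weight exp(-u) on [0, infinity);
# Gauss-Legendre nodes are mapped from [-1, 1] to [0, 1] for v.
U, WU = np.polynomial.laguerre.laggauss(64)
X, WX = np.polynomial.legendre.leggauss(64)
V, WV = 0.5 * (X + 1.0), 0.5 * WX
WEIGHTS = WU[:, None] * WV[None, :]


def omega_e(w):
    """Right-hand side of eq. (2) at normalized strip width w."""
    integrand = np.sqrt(U[:, None] ** 2 + (w ** 2 * V[None, :]) ** 2) * (1.0 - V[None, :])
    return 2.0 / w * float((WEIGHTS * integrand).sum())


def omega_m(w):
    """Right-hand side of eq. (4) at normalized strip width w."""
    integrand = (U[:, None] + w ** 2 * V[None, :]) * (1.0 - V[None, :])
    return 2.0 / w * float((WEIGHTS * integrand).sum())


# Scan the normalized strip width and locate the minima (grid is an assumption).
widths = np.linspace(0.5, 4.0, 351)
e_vals = np.array([omega_e(w) for w in widths])
m_vals = np.array([omega_m(w) for w in widths])
w_e, w_m = widths[e_vals.argmin()], widths[m_vals.argmin()]

if __name__ == "__main__":
    print("Euclidean:", w_e, e_vals.min())
    print("Manhattan:", w_m, m_vals.min())
```

Plotting e_vals and m_vals against widths would give curves of the kind shown in Figs. 2 and 3.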
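Note that the relation $\Omega_M \simeq (4/\pi)\,\Omega_E$ can be explained as follows. Let

$$ x = l_E \sin\theta \quad \text{and} \quad y = l_E \cos\theta . $$

Then,

$$ l_M = x + y = l_E (\cos\theta + \sin\theta) . $$

Since $\langle x \rangle = \langle y \rangle$,

$$ l_M = 2\, l_E \langle \cos\theta \rangle . $$

We have now

$$ \langle \cos\theta \rangle = \frac{2}{\pi} \int_0^{\pi/2} \cos\theta \, d\theta = \frac{2}{\pi} \left[ \sin\theta \right]_0^{\pi/2} = \frac{2}{\pi} . $$

Hence

$$ l_M = \frac{4}{\pi}\, l_E , \quad \text{or} \quad \Omega_M = \frac{4}{\pi}\, \Omega_E . \tag{5} $$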

Let us now estimate the lower bound of the minimum travel distance per city. Let the
distance between any two cities be denoted by $l$. Then the probability that there is a city
between $l$ and $l + dl$ is $\simeq (\rho - 1)\, 2\pi l\, dl \simeq 2\rho\pi l\, dl$. Now, the probability
that there is no other city within the distance $l$ is
$\simeq (1 - \pi l^2)^{\rho - 2} \simeq e^{-(\rho - 2)\pi l^2} \simeq e^{-\rho\pi l^2}$. Therefore,
$P(l)\,dl = (2\rho\pi l)\, e^{-\rho\pi l^2}\, dl$. Note that $\int_0^{\infty} P(l)\,dl = 1$. Hence the
average distance is

$$ l_E = \int_0^{\infty} l\, P(l)\, dl = 2\rho\pi \int_0^{\infty} l^2 e^{-\pi\rho l^2}\, dl = \frac{1}{2\sqrt{\rho}} . \tag{6} $$

Therefore, the lower bound for $\Omega_E$ is $1/2$. The lower bound for $\Omega_M$ can then easily be
estimated to be $2/\pi$ in a similar manner.
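
Eq. (6) is the mean nearest-neighbour distance for randomly placed cities, and it can be checked with a quick numerical experiment. The sketch below is an illustration only: the function name, the use of periodic boundaries (to suppress edge effects) and the sample size are our assumptions.

```python
import numpy as np


def mean_nearest_neighbour_distance(n, seed=0):
    """Mean distance from a city to its nearest neighbour, n random cities in a unit square."""
    rng = np.random.default_rng(seed)
    pts = rng.random((n, 2))
    diff = np.abs(pts[:, None, :] - pts[None, :, :])
    diff = np.minimum(diff, 1.0 - diff)  # periodic boundaries remove edge effects
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a city is not its own neighbour
    return float(dist.min(axis=1).mean())


if __name__ == "__main__":
    n = 1500
    # Should be close to 1/2, the lower bound for Omega_E (rho = N for unit area).
    print(mean_nearest_neighbour_distance(n) * np.sqrt(n))
```

Since no tour can connect a city to anything closer than its nearest neighbour, this average bounds the travel distance per city from below.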

4. The TSP on randomly diluted lattices

The lattice version of the TSP was first studied by Chakrabarti [10]. In the lattice version
of the TSP, the cities are represented by randomly occupied lattice sites of a two-dimensional
square lattice ($L \times L$), the fractional number of sites occupied being $p$ (lattice concentration)
[11-13]. In this case, the average optimal travel distance in the Euclidean metric, $\bar{l}_E$, and in
the Manhattan metric, $\bar{l}_M$, vary with the lattice concentration $p$. We intend to study in this
case the variation of the normalised travel distance per city, $\Omega_E = \bar{l}_E\sqrt{p}$ and
$\Omega_M = \bar{l}_M\sqrt{p}$, with the lattice occupation (city) concentration $p$.

We generate the randomly diluted lattice configuration following the standard Monte
Carlo procedure for $N = 64$ randomly positioned (on the lattice) cities. We vary the lattice
size from $8 \times 8$ to $48 \times 48$, so that the lattice concentration $p$ varies from 1.000 to 0.028.
For each such lattice configuration, the exact optimum tour [See Fig. 4] is obtained with the
help of the GNU tsp_solve [14]. We then calculate $l_E$ and $l_M$. At each lattice concentration
$p$, we take different lattice configurations and then obtain the averages, $\bar{l}_E$ and $\bar{l}_M$.
We then determine $\Omega_E = \bar{l}_E\sqrt{p}$ and $\Omega_M = \bar{l}_M\sqrt{p}$ and study the variation
of $\Omega_E$ and $\Omega_M$, and of the ratio $\Omega_M/\Omega_E$, with $p$. We find that $\Omega_E$ varies
monotonically from 1 (for $p = 1$) to a constant $\simeq 0.79$ (for $p \to 0$), and $\Omega_M$ varies
monotonically from 1 (for $p = 1$) to the constant 1.01 (for $p \to 0$). We believe that with
bigger $N$ the value of $\Omega_E$ eventually reduces to about 0.72, as in the continuum TSP. Results
for higher values of $N$ ($\sim 100$) [15] indeed suggest the same. The ratio $\Omega_M/\Omega_E$ changes
from 1 to 1.26 ($\simeq 4/\pi$) as $p$ varies from 1 to 0 [See Fig. 5]. We note that the TSP on a
randomly diluted lattice is certainly a trivial problem when $p = 1$ (lattice limit), as it reduces
to the one-dimensional TSP (the connections in the optimal tour are between the nearest
neighbours along the lattice; Hamiltonian walks). However, it is certainly hard in the $p \to 0$
(continuum) limit. It is clear that the problem crosses from triviality (for $p = 1$) to the
NP-hard problem (for $p \to 0$) at a certain value of $p$. It seems the transition occurs at
$p < 1$. This requires further investigation.
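
The configuration generation and the normalization used above can be sketched as follows. This is an illustration under our own assumptions: the function names are hypothetical, the lattice spacing is taken as unity, and the tour is passed in as a visiting order rather than computed, since the exact solver used in the text is not reproduced here.

```python
import numpy as np


def diluted_lattice_cities(lattice_size, n_cities=64, seed=0):
    """Occupy n_cities distinct sites of an L x L square lattice at random."""
    rng = np.random.default_rng(seed)
    sites = rng.choice(lattice_size * lattice_size, size=n_cities, replace=False)
    rows, cols = np.divmod(sites, lattice_size)
    concentration = n_cities / lattice_size ** 2  # p = N / L^2
    return np.column_stack((cols, rows)), concentration


def normalized_lengths(cities, order, concentration):
    """Omega_E and Omega_M for the closed tour `order` (lattice spacing = 1)."""
    pts = cities[np.asarray(order)]
    step = np.abs(pts - np.roll(pts, -1, axis=0))  # |dx|, |dy| of each link
    l_e = np.sqrt((step ** 2).sum(axis=1)).mean()  # Euclidean distance per city
    l_m = step.sum(axis=1).mean()                  # Manhattan distance per city
    return float(l_e * np.sqrt(concentration)), float(l_m * np.sqrt(concentration))


if __name__ == "__main__":
    for size in (8, 16, 32, 48):
        cities, p = diluted_lattice_cities(size)
        # The visiting order below is arbitrary, not an optimized tour.
        print(size, round(p, 3), normalized_lengths(cities, range(len(cities)), p))
```

Feeding the optimal visiting order for each configuration into normalized_lengths, and averaging over configurations at fixed $p$, would give the quantities plotted in Fig. 5.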

5. Conclusions

If one places $N$ cities randomly on a continuum in a unit area, the best numerical
results and their analysis (scaling, etc.) suggest that the best optimized travel distance per
city becomes $l_E \simeq 0.72/\sqrt{N}$ for the Euclidean metric and $l_M \simeq 0.92/\sqrt{N}$ for the
Manhattan metric. The analytic bounds we discussed in section 3 give
$\Omega_E\ (= l_E\sqrt{N}) < 0.92$ and $\Omega_M\ (= l_M\sqrt{N}) < 1.17$. When the cities are randomly
placed on a lattice with concentration $p$, as discussed in section 4, we find (with $N = p$ for
unit area of the country) that $\Omega_E(p)$ and $\Omega_M(p)$ vary monotonically with $p$. The
problem is trivial for $p = 1$, where $\Omega_E(p) = \Omega_M(p) = 1$, and it certainly reduces to the
continuum TSP discussed before for $p \to 0$ ($\Omega_E \simeq 0.72$ and $\Omega_M \simeq 0.92$; although
we observed higher values, viz., $\Omega_E \simeq 0.79$ and $\Omega_M \simeq 1.01$, since $N$ is not
sufficiently large). The variations of $\Omega$ with $p$ are found to be
monotonic, without any irregular behaviour at any intermediate point such as the percolation
point. The crossover from triviality to the NP-hard problem seems to occur at
$p < 1$. However, this requires further investigation.

Acknowledgement : We are grateful to R. B. Stinchcombe, A. Percus and O. C. Martin 
for very useful comments and suggestions. 

Figure captions

Fig. 1 : Calculating the average distance between two nearest neighbours along a strip of 
width W. 

Fig. 2 : Plot of $l_E\sqrt{\rho}$ against $W\sqrt{\rho}$ from eqn. (2).

Fig. 3 : Plot of $l_M\sqrt{\rho}$ against $W\sqrt{\rho}$ from eqn. (4).

Fig. 4 : A typical optimized tour for TSP on dilute lattice in the Euclidean metric for 
N = 64 cities. 

Fig. 5 : Plot of $\Omega_E$, $\Omega_M$ and $\Omega_M/\Omega_E$ against $p$ for TSP on dilute lattice,
obtained using the optimization programs (exact) for $N = 64$ cities (fixed). The error bars
are due to configuration to configuration variations.